Result diversification based on query-specific cluster ranking

Authors

  • Jiyin He
  • Edgar Meij
  • Maarten de Rijke
Abstract

Result diversification is a retrieval strategy for dealing with ambiguous or multi-faceted queries by providing documents that cover as many facets of the query as possible. We propose a result diversification framework based on query-specific clustering and cluster ranking, in which diversification is restricted to documents belonging to clusters that potentially contain a high percentage of relevant documents. Empirical results show that the proposed framework improves the performance of several existing diversification methods. The framework also gives rise to a simple yet effective cluster-based approach to result diversification that selects documents from different clusters in a round-robin fashion for inclusion in a ranked list. We describe a set of experiments aimed at thoroughly analyzing the behavior of the two main components of the proposed diversification framework: ranking clusters and selecting clusters for diversification. Both components have a crucial impact on the overall performance of our framework, but ranking clusters plays a more important role than selecting clusters. We also examine the properties that clusters should have for our diversification framework to be effective. Most relevant documents should be contained in a small number of high-quality clusters, and no cluster should be dominantly large. In addition, documents from these high-quality clusters should have diverse content. These properties are strongly correlated with the overall performance of the proposed diversification framework.
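
The round-robin selection mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `round_robin_diversify`, the parameters `top_t` and `k`, and the input layout are assumptions made for illustration. The sketch assumes the clusters are already ranked by some cluster-ranking method (which the abstract does not specify) and that each cluster lists its documents in relevance order.

```python
from typing import List, Sequence


def round_robin_diversify(
    ranked_clusters: Sequence[Sequence[str]],
    top_t: int,
    k: int,
) -> List[str]:
    """Interleave documents from the top_t ranked clusters.

    ranked_clusters: clusters in ranked order (best first); each cluster
        is a list of document ids in relevance order (assumed input).
    top_t: number of top-ranked clusters to diversify over (hypothetical name).
    k: length of the result list to produce.
    """
    # Restrict diversification to the selected (top-ranked) clusters.
    selected = [list(cluster) for cluster in ranked_clusters[:top_t]]

    result: List[str] = []
    seen = set()
    depth = 0  # position inside each cluster for the current pass
    while len(result) < k:
        any_left = False
        for cluster in selected:
            if depth >= len(cluster):
                continue  # this cluster is exhausted
            any_left = True
            doc = cluster[depth]
            if doc not in seen:  # guard against overlapping clusters
                seen.add(doc)
                result.append(doc)
                if len(result) == k:
                    return result
        if not any_left:
            break  # every selected cluster is exhausted
        depth += 1
    return result


if __name__ == "__main__":
    clusters = [["d1", "d4", "d7"], ["d2", "d5"], ["d3"], ["d9"]]
    print(round_robin_diversify(clusters, top_t=3, k=5))
```

Each pass takes one document from every selected cluster before returning to the first, so the top of the list draws on several clusters and no single cluster fills it. Documents in clusters outside the top `top_t` are never returned, which reflects the abstract's statement that diversification is restricted to the selected clusters.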

Similar resources

Result Diversification Based on Query-Specific Cluster Ranking (Abstract)

Result diversification is a retrieval strategy for dealing with ambiguous or multi-faceted queries by providing documents that cover as many potential facets of the query as possible. We propose a result diversification framework based on query-specific clustering and cluster ranking, in which diversification is restricted to documents belonging to a set of clusters that potentially contain a h...

Score and Rank Aggregation Methods for Explicit Search Result Diversification

Search result diversification is one of the key techniques to cope with the ambiguous and/or underspecified information needs of the web users. In the last few years, strategies that are based on the explicit knowledge of query aspects emerged as highly effective ways of diversifying the search results. Our contributions in this work are two-fold. First, we extensively evaluate the performance ...

Modelling efficient novelty-based search result diversification in metric spaces

Novelty-based diversification provides a way to tackle ambiguous queries by re-ranking a set of retrieved documents. Current approaches are typically greedy, requiring O(n^2) document–document comparisons in order to diversify a ranking of n documents. In this article, we introduce a new approach for novelty-based search result diversification to reduce th...

mNIR: Diversifying Search Results Based on a Mixture of Novelty, Intention and Relevance

Current search engines do not explicitly take different meanings and usages of user queries into consideration when they rank the search results. As a result, they tend to retrieve results that cover the most popular meanings or usages of the query. Consequently, users who want results that cover a rare meaning or usage of query or results that cover all different meanings/usages may have to go...

Building a Microblog Corpus for Search Result Diversification

Queries that users pose to search engines are often ambiguous either because different users express different query intents with the same query terms or because the query is underspecified and it is unclear which aspect of a particular query the user is interested in. In the Web search setting, search result diversification, whose goal is the creation of a search result ranking covering a rang...

Journal:
  • JASIST

Volume 62

Publication date: 2011